0.   INTRODUCTION

Syntax-directed translation schema (SDTS) coupled with LR parsing
is a commonly used technique for recognizing and translating a large
class of LR languages. However, this scheme often leads to a
prohibitively large LR parsing table. This paper presents an extended
SDTS (ESDTS) which can aid in reducing the size of the parsing table.

Part I presents several examples that are intended to familiarize
the reader with the notation and conventions used throughout the paper.
Part II discusses the motivation behind the ESDTS: how SDTS often
requires excessive stratification of the grammar in order to implement
language semantics. Part III develops the extended scheme itself and
suggests a possible implementation. Part IV analyzes implementation
aspects in order to compare ESDTS with SDTS.


I.  NOTATION 


In order to familiarize the reader with the conventions and
notation used throughout this paper, several examples are presented
before beginning discussion of the translation scheme. For a more
detailed description of the theories of LR parsing, syntax-directed
translation schema and related algorithms, see [AU72].


Grammar 1:    ASSIGN ::= id := EXPR
              EXPR   ::= ELEM
                     ::= EXPR op ELEM
              ELEM   ::= id
                     ::= number


In this grammar the set of terminal symbols is {id, number, :=,
op}; the set of non-terminals is {ASSIGN, EXPR, ELEM}. The terminal
symbol op is a token class representing +, -, *, and / in the same way
that id and number represent the token classes of identifiers and
constants, respectively. The mapping of token to terminal takes place
during the lexical analysis phase so that the parser never sees the
specific token; however, this information is available for use by
the semantic routines. The language generated by Grammar 1 is the
set of assignment statements where the operators have simple
left-to-right associativity; e.g. A:=B*C+12/D and Y:=Z-7*Y/U+X. The
LR(1) table for recognizing this language is given in Figure 1.


The LR(1) parsing algorithm is as follows:

Algorithm 1

INPUT:   LR(1) parsing table PARSE_TABLE and string to be parsed.
OUTPUT:  YES, if string ∈ L(G); otherwise ERROR.

1.  Initialize state stack to T0.

2.  Get  the  next  incoming  token  x;   if  the  input  string  is  now 
empty,  the  token  is  e. 

3.  Let  Tn  be  the  state  on  the  top  of  the  state  stack. 

a.  If  PARSE_TABLE(Tn,x)=Sm  then  shift:   push  Tm  onto  the 
state  stack  and  return  to  step  2. 

b.  If PARSE_TABLE(Tn,x)=Ri then reduce: if production
rule i is A::=α, then pop |α| states off the state stack,
exposing a new state Tm on the top of the stack. If
PARSE_TABLE(Tm,A)=Sj then push Tj onto the state stack
and return to step 3 with the same token x (a reduction
consumes no input); otherwise halt with an ERROR.

c.  If  PARSE_TABLE(Tn,x)=A  then  accept:   halt  and  declare 
the  input  string  valid  by  outputting  YES. 

d.  If PARSE_TABLE(Tn,x)=E then error: halt with output of
ERROR (or transfer to an error recovery routine).

A  PL/1  implementation  of  Algorithm  1  is  given  in  Appendix  A. 

A translation scheme, in general, is a means of mapping a given
input string to an output string. A syntax-directed translation scheme
(SDTS) performs this mapping by associating a partial translation
function with each production rule in a grammar. Whenever a production
rule is used in the derivation of the input string, its respective translation


        TERMINALS                               NONTERMINALS
STATES  id      number  :=      op      e       ASSIGN  EXPR    ELEM

T0      S1                                      S9
T1                      S2
T2      S5      S6                                      S3      S4
T3                              S7      R1
T4                              R2      R2
T5                              R4      R4
T6                              R5      R5
T7      S5      S6                                              S8
T8                              R3      R3
T9                                      A

Sn means shift incoming token and goto state n.
Rn means reduce by production rule n.
A means accept.
Blanks are error entries.

Grammar 1:

1  ASSIGN ::= id := EXPR
2  EXPR   ::= ELEM
3         ::= EXPR op ELEM
4  ELEM   ::= id
5         ::= number

Grammar 1 and its LR(1) table.

Figure 1.


function is applied to compute a portion of the output string. A
trivial example of an SDTS would be outputting the number of a
production rule when it is used in a reduction; the output generated
is the right parse (the reverse sequence of productions used in the
rightmost derivation) of the input string.

To  illustrate,  define  a  configuration  of  the  parser  as  a 
triple  of  (state  stack,  remaining  input  string,  output  string) 
corresponding  to  step  3  of  Algorithm  1.   Then  given  Algorithm  1 
with  input  of  the  table  in  Figure  1  and  string  Z:=X/Y*5  and  the 
translation  described  above,  the  parser  would  take  on  the  following 
sequence  of  configurations: 

(T0, Z:=X/Y*5, e) -> (T0T1, :=X/Y*5, e)
                  -> (T0T1T2, X/Y*5, e)
                  -> (T0T1T2T5, /Y*5, e)
                  -> (T0T1T2T4, /Y*5, 4)
                  -> (T0T1T2T3, /Y*5, 42)
                  -> (T0T1T2T3T7, Y*5, 42)
                  -> (T0T1T2T3T7T5, *5, 42)
                  -> (T0T1T2T3T7T8, *5, 424)
                  -> (T0T1T2T3, *5, 4243)
                  -> (T0T1T2T3T7, 5, 4243)
                  -> (T0T1T2T3T7T6, e, 4243)
                  -> (T0T1T2T3T7T8, e, 42435)
                  -> (T0T1T2T3, e, 424353)
                  -> (T0T9, e, 4243531)
                  -> accept and halt
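
Algorithm 1 together with the rule-number translation can be sketched
in a few lines of Python. This is only an illustrative sketch, not the
PL/1 implementation of Appendix A: the table is transcribed from
Figure 1, while the lexer, the dictionary layout and the names
tokenize and parse are assumptions made for the example.

```python
import re

# Grammar 1: rule number -> (left side, length of right side).
RULES = {1: ("ASSIGN", 3), 2: ("EXPR", 1), 3: ("EXPR", 3),
         4: ("ELEM", 1), 5: ("ELEM", 1)}

# PARSE_TABLE[state][symbol], transcribed from Figure 1.
# Missing entries are error entries; "e" marks the end of the input.
PARSE_TABLE = {
    0: {"id": "S1", "ASSIGN": "S9"},
    1: {":=": "S2"},
    2: {"id": "S5", "number": "S6", "EXPR": "S3", "ELEM": "S4"},
    3: {"op": "S7", "e": "R1"},
    4: {"op": "R2", "e": "R2"},
    5: {"op": "R4", "e": "R4"},
    6: {"op": "R5", "e": "R5"},
    7: {"id": "S5", "number": "S6", "ELEM": "S8"},
    8: {"op": "R3", "e": "R3"},
    9: {"e": "A"},
}

TOKEN = re.compile(r"\s*(?:(:=)|([-+*/])|(\d+)|([A-Za-z]\w*))")

def tokenize(text):
    """Hypothetical lexer: map every token to its terminal class."""
    terminals, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"bad character at position {pos}")
        assign, op, number, _ident = m.groups()
        terminals.append(":=" if assign else "op" if op
                         else "number" if number else "id")
        pos = m.end()
    return terminals + ["e"]

def parse(text):
    """Return the right parse (list of rule numbers), or None on ERROR."""
    tokens = tokenize(text)
    stack, output, i = [0], [], 0                      # step 1
    while True:
        entry = PARSE_TABLE[stack[-1]].get(tokens[i])  # step 3
        if entry is None:                              # 3d: blank entry
            return None
        if entry == "A":                               # 3c: accept
            return output
        kind, n = entry[0], int(entry[1:])
        if kind == "S":                                # 3a: shift
            stack.append(n)
            i += 1                                     # step 2: next token
        else:                                          # 3b: reduce by rule n
            left, length = RULES[n]
            del stack[-length:]
            goto = PARSE_TABLE[stack[-1]].get(left)
            if goto is None:
                return None
            stack.append(int(goto[1:]))
            output.append(n)                           # the translation

print(parse("Z:=X/Y*5"))
```

For the string Z:=X/Y*5 the list returned corresponds to the output
component of the last configuration above.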

The translation defined above is not particularly useful in a
compiler; more often the translation generated is some type of
assembly code or intermediate text which indicates in detail the
"meaning" of the input string. In future examples the code that
will be generated by the translation will be assembly code for a
(fictitious) stack machine. The instructions and their meanings are:


PUSH number       push value on run-time stack

PUSH #id          push address of id on stack

FETCH             replace the address on the top of the
                  stack with the value at that location

SADD              stack operations: if the values of X
SSUB              and Y are on the top of the stack,
SMUL              executing one of these commands pops
SDIV              both values off the stack, and
                  pushes the value of X op Y

POP               pop a value and an address off the
                  top of the stack and store the value
                  at that address

STORE             same as POP except that the value is
                  returned to the top of the stack


For  example,  the  translation  of  Z:=X/Y*5  would  be: 

PUSH #Z
PUSH  #X 
FETCH 
PUSH  #Y 
FETCH 
SDIV 
PUSH  5 
SMUL 
POP 
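
The effect of such a program can be checked with a small interpreter
for the stack machine. The sketch below is an assumption-laden model,
not part of the original design: addresses are modelled simply as
variable names, values as floating-point numbers, and the function
name run is hypothetical.

```python
def run(program, memory):
    """Execute a list of instruction strings on the fictitious stack machine.

    `memory` maps variable names (standing in for addresses) to values.
    Returns whatever is left on the run-time stack.
    """
    arithmetic = {
        "SADD": lambda x, y: x + y,
        "SSUB": lambda x, y: x - y,
        "SMUL": lambda x, y: x * y,
        "SDIV": lambda x, y: x / y,
    }
    stack = []
    for line in program:
        opcode, _, operand = line.partition(" ")
        operand = operand.strip()
        if opcode == "PUSH":
            if operand.startswith("#"):
                stack.append(operand[1:])      # address of the identifier
            else:
                stack.append(float(operand))   # numeric constant
        elif opcode == "FETCH":
            stack.append(memory[stack.pop()])  # address -> value
        elif opcode in arithmetic:
            y = stack.pop()                    # Y is on top, X below it
            x = stack.pop()
            stack.append(arithmetic[opcode](x, y))
        elif opcode in ("POP", "STORE"):
            value = stack.pop()
            address = stack.pop()
            memory[address] = value
            if opcode == "STORE":              # STORE keeps the value
                stack.append(value)
        else:
            raise ValueError(f"unknown instruction {line!r}")
    return stack

program = ["PUSH #Z", "PUSH #X", "FETCH", "PUSH #Y", "FETCH",
           "SDIV", "PUSH 5", "SMUL", "POP"]
memory = {"X": 20.0, "Y": 4.0}
print(run(program, memory), memory["Z"])
```

With X = 20 and Y = 4 the program leaves the run-time stack empty and
stores 20/4*5 in Z.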

The term "semantics" is generally used for associating a translation
or "meaning" with an input string. Throughout this paper the term
"semantic actions" will encompass not only the explicit outputting
of a translation, but will also include:

-  any operations necessary to compute the translation

-  bookkeeping tasks for the compiler

-  error checking/correction


II.  MOTIVATION

Although LR parsing coupled with SDTS is a fairly elegant
technique for recognizing and translating a large class of LR
languages, a serious implementation problem remains: the LR
table for even a moderately complex language can become
prohibitively large. One factor contributing to large table size is that
an SDTS often leads to excessive stratification of the grammar in
order to accommodate semantic analysis. There are two types of
stratifications generated: those resulting from the linking of
semantic actions with production rules, and those resulting from
reliance on strictly local context to perform semantics.

If a particular semantic action is required at a given point in
the parsing of a string, it must be guaranteed that a reduction takes
place at that point, since a semantic action in an SDTS may only be
performed when a reduction takes place. For instance, in Grammar 1
if it were necessary to perform some semantic action when the id of
ASSIGN ::= id := EXPR was recognized, a reduction would have to be
forced there using productions such as

ASSIGN ::= VAR := EXPR
VAR    ::= id

The semantic action could then be accomplished when the reduction
VAR ::= id took place. Similarly, requiring an action when op is
encountered might be realized by:

EXPR ::= EXPR OP ELEM
OP   ::= op

Grammar  1  extended  in  this  manner,  its  LR(l)  table,  and  sample 
semantic  actions  are  shown  in  Figure  2. 

The productions added in this way are "useless" in that they
contribute no information to the parse, but are present only to
provide a means for semantic analysis. They do, unfortunately,
add to the size of the LR table: each supplementary rule requires
a new nonterminal (another column in the table) and adds an extra
state (another row in the table). The number of entries in the LR
table for Grammar 1E represents an increase of 50% over the table
for Grammar 1.

The  second  type  of  stratification  results  from  the  dependency 
of  SDTS  on  strictly  local  context;   that  is,  in  performing  a  semantic 
action  in  an  SDTS  the  only  context  information  available  is  which 
production  rule  is  currently  being  used  in  the  reduction.   In  many 
cases,  however,  the  semantic  action  does  not  depend  totally  on  the 
local  configuration,  but  also  on  the  global  context  in  which  this 
configuration  is  embedded. 

For  example,  a  common  situation  is  the  following: 


Grammar 2:   DECLARE ::= type† IDLIST
             INPUT   ::= INPUT IDLIST
             IDLIST  ::= id
                     ::= IDLIST , id

† the terminal type is again a token class representing {INTEGER,
REAL, BOOLEAN, ...}.


STATES id     number :=     op     e      ASSIGN EXPR   ELEM   VAR    OP

T0     S10                                S9                   S1
T1                   S2
T2     S5     S6                                 S3     S4
T3                          S11    R1                                 S7
T4                          R2     R2
T5                          R4     R4
T6                          R5     R5
T7     S5     S6                                        S8
T8                          R3     R3
T9                                 A
T10                  R6
T11    R7     R7

Grammar 1E:                          Semantic Actions:

1  ASSIGN ::= VAR := EXPR            Generate POP
2  EXPR   ::= ELEM
3         ::= EXPR OP ELEM           Pop operator stack and generate SADD,
                                     SSUB, SMUL, SDIV appropriately
4  ELEM   ::= id                     Generate PUSH #id, FETCH
5         ::= number                 Generate PUSH number
6  VAR    ::= id                     Generate PUSH #id
7  OP     ::= op                     Push op code onto operator stack

Grammar 1E, its associated semantics and LR table.

Figure 2.

This language contains strings such as INTEGER A,B,C or INPUT X,Y.
Clearly different semantics should be associated with the identifiers
appearing in a declaration list than with the identifiers in an
input list. In the former case the type of id might be entered in
the symbol table; in the latter case a READ id might be generated,
and the id might be checked to determine if it had been declared.
Performing the appropriate semantic action does not depend on the
fact that the id is contained in an IDLIST but rather depends on
whether the IDLIST is associated with a declaration or input statement.
In order to correctly perform the semantic actions using SDTS, the
grammar must be stratified in such a way that each production rule
indicates its global context. One way to accomplish this is:


Grammar 2E:  DECLARE ::= type DECLIST
             INPUT   ::= INPUT INLIST
             DECLIST ::= id
                     ::= DECLIST , id
             INLIST  ::= id
                     ::= INLIST , id


This type of stratification results in redundant production rules,
and is even more costly in terms of table entries than the first type.
Not only is a new nonterminal added, but more than one extra state is
required (in this example, 3 extra states were needed).

Another  example  of  this  situation  is  the  problem  of  embedded 
assignment  statements.   Grammar  IE  can  be  modified  to  include  this 
feature: 
Grammar 3:   ASSIGN ::= VAR := EXPR
             EXPR   ::= ELEM
                    ::= EXPR OP ELEM
             ELEM   ::= id
                    ::= number
                    ::= ( ASSIGN )
             VAR    ::= id
             OP     ::= op


This language includes strings such as A:=B+(C:=D/2)-E and
W:=(Y:=Z*(X:=X+1))/3.

When the reduction ASSIGN ::= VAR := EXPR takes place, a POP
is generated, which will pop both the expression value and the address
of the variable off the run-time stack. However, if this is an
embedded assignment it is necessary to retain the expression value on the
top of the stack since it will be required for further computation;
this is the action produced by a STORE instruction. Therefore, a
STORE should be generated when reducing an embedded assignment, and
a POP generated only for non-embedded assignments. A sample extension
of Grammar 3 to accommodate these semantics is:


Grammar 3E:  ASSIGN  ::= VAR := EXPR
             EXPR    ::= ELEM
                     ::= EXPR OP ELEM
             ELEM    ::= id
                     ::= number
                     ::= ( EASSIGN )
             VAR     ::= id
             OP      ::= op
             EASSIGN ::= VAR := EXPR


Figures 3 and 4 show the LR tables for Grammars 3 and 3E respectively.
The number of table entries in the table for Grammar 3E represents a
30% increase over the number of entries in the table for Grammar 3.

STATE id    numbr :=    op    (     )     e     ASSGN EXPR  ELEM  VAR   OP

T0    S2                                        S14               S1
T1                S3
T2                R7
T3    S6    S7                S8                      S4    S5
T4                      S10         R1    R1                            S9
T5                      R2          R2    R2
T6                      R4          R4    R4
T7                      R5          R5    R5
T8    S2                                        S11               S1
T9    S6    S7                S8                            S12
T10   R8    R8                R8
T11                                 S13
T12                     R3          R3    R3
T13                     R6          R6    R6
T14                                       A

Grammar 3:

1  ASSIGN ::= VAR := EXPR
2  EXPR   ::= ELEM
3         ::= EXPR OP ELEM
4  ELEM   ::= id
5         ::= number
6         ::= ( ASSIGN )
7  VAR    ::= id
8  OP     ::= op

Grammar 3 and its LR table.

Figure 3.


STATE id    numbr :=    op    (     )     e     ASSGN EXPR  ELEM  VAR   OP    EASGN

T0    S2                                        S14               S1
T1                S3
T2                R7
T3    S6    S7                S8                      S4    S5
T4                      S10               R1                            S9
T5                      R2          R2    R2
T6                      R4          R4    R4
T7                      R5          R5    R5
T8    S2                                                          S15         S11
T9    S6    S7                S8                            S12
T10   R8    R8                R8
T11                                 S13
T12                     R3          R3    R3
T13                     R6          R6    R6
T14                                       A
T15               S16
T16   S6    S7                S8                      S17   S5
T17                     S10         R9                                  S9

Grammar 3E:                          Semantics:

1  ASSIGN  ::= VAR := EXPR           Generate POP
2  EXPR    ::= ELEM
3          ::= EXPR OP ELEM          Pop operator stack and generate SADD,
                                     SSUB, SMUL, or SDIV appropriately
4  ELEM    ::= id                    Generate PUSH #id, FETCH
5          ::= number                Generate PUSH number
6          ::= ( EASSIGN )
7  VAR     ::= id                    Generate PUSH #id
8  OP      ::= op                    Push op code onto operator stack
9  EASSIGN ::= VAR := EXPR           Generate STORE

Grammar 3E, its associated semantics and LR table.

Figure 4.

III.   THE  TRANSLATION  SCHEME

As shown in the previous section, the use of SDTS to incorporate
semantic analysis into an LR parse is often inconvenient and costly.
The question now is how semantic actions can be performed as required
without stratifying the grammar.

Eliminating Type 1 Stratifications

One way to extend the SDTS in order to eliminate the first type
of stratification is to relax the constraint that semantic actions be
associated only with reductions. In fact, why not allow semantic actions
to be performed at any point of the parse: not only on reduction, but
also on shift, accept and error?† In order to see how this might be
done, it is helpful to clarify what is actually occurring in an SDTS.

To carry out the translation, the normal SDTS uses internal
information from the parse: the n in an Rn entry of the LR table. This
number actually communicates two pieces of information: what production
rule is to be applied (for the parse) and, since the semantic actions
have the same numbers as the production rules, what semantic action is
to be performed (for the translation). This duality is lacking in the
other types of table entries (S, E, and A), so packing two pieces of
information into each R entry causes stratification by forcing a
reduction whenever a semantic action is required.

† Professor T. R. Wilcox was instrumental in the choice of this
attack on the problem.

With this clarification, a solution is apparent. First, split
this single item that serves dual purposes into two separate items,
and second, since reduce entries now explicitly contain semantic
information, modify the LR table so that all entries carry semantic
information. Therefore, instead of each table entry containing only
two pieces of information (the type of parsing action and its
associated number), each entry now contains a third piece of information,
the number of the semantic action to be performed:


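    +----------------+----------------+----------------+
    | type of        | parsing        | semantic       |
    | parsing action | action #       | action #       |
    +----------------+----------------+----------------+
      Shift            (goto) state #
      Reduce           prod. rule #
      Error            error #
      Accept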


By thus disassociating the semantic actions from the production
rules (and numbering them independently) and embedding the appropriate
semantic action number directly in the LR table entry that corresponds
to the point in the parse where that action is to take place, semantic
actions can be performed at any point of the parse. Using this
configuration, an entry notation R3/5 means reduce by rule 3 and then perform
semantic action 5; S10/8 means shift, perform semantic action 8, then
goto state 10. Figure 5 shows how the semantics of Grammar 1E can be
directly incorporated into the LR table of Grammar 1; the
stratifications are no longer necessary to correctly include semantic analysis.
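
The entry notation can be exercised with a small driver. The sketch
below transcribes the table and semantic actions of Figure 5; the
lexer, the decode helper, and the convention of handing each semantic
routine the text of the most recently shifted token are assumptions
made for illustration, not details taken from the paper.

```python
import re

RULES = {1: ("ASSIGN", 3), 2: ("EXPR", 1), 3: ("EXPR", 3),
         4: ("ELEM", 1), 5: ("ELEM", 1)}

# Entries written as in Figure 5: "S7/6" = shift, semantic action 6, goto T7.
TABLE = {
    0: {"id": "S1/5", "ASSIGN": "S9"},
    1: {":=": "S2"},
    2: {"id": "S5", "number": "S6", "EXPR": "S3", "ELEM": "S4"},
    3: {"op": "S7/6", "e": "R1/1"},
    4: {"op": "R2", "e": "R2"},
    5: {"op": "R4/3", "e": "R4/3"},
    6: {"op": "R5/4", "e": "R5/4"},
    7: {"id": "S5", "number": "S6", "ELEM": "S8"},
    8: {"op": "R3/2", "e": "R3/2"},
    9: {"e": "A"},
}
OPCODES = {"+": "SADD", "-": "SSUB", "*": "SMUL", "/": "SDIV"}
TOKEN = re.compile(
    r"\s*(?:(?P<assign>:=)|(?P<op>[-+*/])|(?P<number>\d+)|(?P<id>[A-Za-z]\w*))")

def tokenize(text):
    """Hypothetical lexer returning (terminal, lexeme) pairs."""
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"bad character at position {pos}")
        kind = m.lastgroup
        tokens.append((":=" if kind == "assign" else kind, m.group(kind)))
        pos = m.end()
    return tokens + [("e", "")]

def decode(entry):
    """Split 'R4/3' into ('R', 4, 3); the semantic part may be absent."""
    head, _, action = entry.partition("/")
    return head[0], int(head[1:] or 0), int(action) if action else None

def translate(text):
    tokens = tokenize(text)
    stack, operators, code, i, last = [0], [], [], 0, ""

    def semantic(action, lexeme):            # semantic actions of Figure 5
        if action == 1:
            code.append("POP")
        elif action == 2:
            code.append(OPCODES[operators.pop()])
        elif action == 3:
            code.extend([f"PUSH #{lexeme}", "FETCH"])
        elif action == 4:
            code.append(f"PUSH {lexeme}")
        elif action == 5:
            code.append(f"PUSH #{lexeme}")
        elif action == 6:
            operators.append(lexeme)

    while True:
        terminal, lexeme = tokens[i]
        entry = TABLE[stack[-1]].get(terminal)
        if entry is None:
            raise SyntaxError(f"unexpected {terminal!r}")
        kind, n, action = decode(entry)
        if kind == "A":
            return code
        if kind == "S":                      # shift, semantic action, goto
            if action:
                semantic(action, lexeme)
            stack.append(n)
            last, i = lexeme, i + 1
        else:                                # reduce, then semantic action
            left, length = RULES[n]
            del stack[-length:]
            stack.append(decode(TABLE[stack[-1]][left])[1])
            if action:
                semantic(action, last)

print("\n".join(translate("Z:=X/Y*5")))
```

Applied to Z:=X/Y*5 the sketch emits the stack-machine program given
in Part I for that string, although the grammar contains neither VAR
nor OP.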


Eliminating Redundant Semantic Code

It is worth noting that in Figure 5 semantic action 3 is actually
composed of two more basic actions, generate PUSH #id and generate


STATES  id      number  :=      op      e       ASSIGN  EXPR    ELEM

T0      S1/5                                    S9
T1                      S2
T2      S5      S6                                      S3      S4
T3                              S7/6    R1/1
T4                              R2      R2
T5                              R4/3    R4/3
T6                              R5/4    R5/4
T7      S5      S6                                              S8
T8                              R3/2    R3/2
T9                                      A

Grammar 1:

1  ASSIGN ::= id := EXPR
2  EXPR   ::= ELEM
3         ::= EXPR op ELEM
4  ELEM   ::= id
5         ::= number

Semantic Actions:

1  Generate POP
2  Pop operator stack and generate
   SADD, SSUB, SMUL, SDIV appropriately
3  Generate PUSH #id, FETCH
4  Generate PUSH number
5  Generate PUSH #id
6  Push op code onto operator stack

Grammar 1 with the semantics of Grammar 1E
embedded directly in the modified LR table.

Figure 5.

FETCH; furthermore, the basic action generate PUSH #id is repeated
in semantic action 5. A question that naturally arises is how to
indicate that several basic actions are to be performed together
without wasteful duplication of code.

One  way  to  attack  this  problem  is  to  embed  in  the  LR  table 
entry  not  the  actual  semantic  action  number  but  a  pointer  to  a 
linked  list  of  basic  semantic  action  numbers.   The  result  is  a 
configuration  such  as  the  one  shown  in  Figure  6. 

For instance, in Figure 5 if the semantic items in the LR table
are viewed as pointers into a semantic table, and again the semantic
actions are renumbered, the resulting semantic table would be as
given in Figure 7. If the parser is in state T0 looking at incoming
token id, id is shifted, basic semantic action 2 is performed, and
the parser goes to state T1. If the parser is in state T5 looking
at incoming token op, reduction by rule 4 is done, and basic semantic
actions 2 and 3 are both performed. In this way the code for an action
that should be done at many different points in the parse need not be
duplicated.
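
The traversal just described can be sketched as follows. The rows are
those of the semantic table in Figure 7; representing the table as a
dictionary, using 0 as the end-of-list mark, and the function name
basic_actions are assumptions of this sketch.

```python
NIL = 0  # assumed encoding of the end-of-list mark

# SEMANTIC_TABLE[pointer] = (basic action number, link), as in Figure 7.
SEMANTIC_TABLE = {
    1: (1, NIL),
    2: (6, NIL),
    3: (2, 7),     # PUSH #id, then continue at row 7
    4: (4, NIL),
    5: (2, NIL),
    6: (5, NIL),
    7: (3, NIL),   # FETCH
}

BASIC_ACTIONS = {
    1: "generate POP",
    2: "generate PUSH #id",
    3: "generate FETCH",
    4: "generate PUSH number",
    5: "push op code onto the operator stack",
    6: "pop operator stack and generate SADD, SSUB, SMUL, SDIV",
}

def basic_actions(pointer):
    """Follow the links from `pointer` and collect the basic action numbers."""
    actions = []
    while pointer != NIL:
        action, pointer = SEMANTIC_TABLE[pointer]
        actions.append(action)
    return actions

# PARSE_TABLE(T5, op) carries pointer 3; PARSE_TABLE(T0, id) carries pointer 5.
for pointer in (3, 5):
    print(pointer, [BASIC_ACTIONS[a] for a in basic_actions(pointer)])
```

The code for generate PUSH #id exists once, yet it is reached both
from pointer 3 and from pointer 5.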

Eliminating Type 2 Stratifications

The second type of stratification is a result of semantic actions
at a reduction depending on more contextual information than is supplied
by the production rule number alone. To choose the appropriate semantic
action in such a case, it is necessary to also know part of the previous
sequence of parser actions that led to the current configuration. An
important element in the solution to this problem is the fact that,
given a state of an LR table, it is known locally how much of which

Parsing Table Entry:

    | type of parsing action | parsing action number | semantic action pointer |

Semantic Action Table (the list reached through the semantic action pointer):

    | basic action 1 | link | -> | basic action 2 | link | -> ... -> | basic action n | Λ |

The configuration of an LR table entry and its
associated linked list of basic semantic actions.

Figure 6.


SEMANTIC TABLE

                                     #    BASIC ACTION #    LINK
                                     1          1            Λ
                                     2          6            Λ
PARSE_TABLE(T5,op): pointer 3  -->   3          2            7
                                     4          4            Λ
PARSE_TABLE(T0,id): pointer 5  -->   5          2            Λ
                                     6          5            Λ
                                     7          3            Λ

BASIC ACTIONS:

1.  generate POP
2.  generate PUSH #id
3.  generate FETCH
4.  generate PUSH number
5.  push op code onto the operator stack
6.  pop operator stack and generate SADD, SSUB,
    SMUL, SDIV appropriately

The basic semantic actions and the semantic table configuration
when the semantic items of Figure 5 are viewed as pointers.

Figure 7.

(T0, A:=B+(C:=D/2)-E) ->
(T0T1, :=B+(C:=D/2)-E) ->
(T0T1T3, B+(C:=D/2)-E) ->
(T0T1T3T6, +(C:=D/2)-E) ->
(T0T1T3T5, +(C:=D/2)-E) ->
(T0T1T3T4, +(C:=D/2)-E) ->
(T0T1T3T4T10, (C:=D/2)-E) ->
(T0T1T3T4T9, (C:=D/2)-E) ->
(T0T1T3T4T9T8, C:=D/2)-E) ->
(T0T1T3T4T9T8T2, :=D/2)-E) ->
(T0T1T3T4T9T8T1, :=D/2)-E) ->
(T0T1T3T4T9T8T1T3, D/2)-E) ->
(T0T1T3T4T9T8T1T3T5, /2)-E) ->
(T0T1T3T4T9T8T1T3T4, /2)-E) ->
(T0T1T3T4T9T8T1T3T4T10, 2)-E) ->
(T0T1T3T4T9T8T1T3T4T9, 2)-E) ->
(T0T1T3T4T9T8T1T3T4T9T7, )-E) ->
(T0T1T3T4T9T8T1T3T4T9T12, )-E) ->
(T0T1T3T4T9T8T1T3T4, )-E)

The sequence of parser configurations with LR table of Figure 3
and string A:=B+(C:=D/2)-E as input.

Figure 8.

production rule(s) has already been scanned and exactly what possible
items are anticipated to complete the rule(s). For example, returning
to the table in Figure 4, when the parser is in state T3, VAR := of
production rule 1 has already been recognized, and the parser is
anticipating an expression, which may begin with any of several
symbols. When in state T8, the parser has scanned the open parenthesis
of an embedded assignment and is expecting the assignment itself. The
algorithm for constructing an LR table for a grammar conveys the
essence of this concept; for more details see [AU72].

Since  each  state  is  essentially  a  "snap  shot"  of  the  local 
environment  at  a  certain  point  of  the  parse,  knowing  the  sequence  of 
states  that  have  been  traversed  to  reach  the  present  point  is  equivalent 
to  having  a  history  of  the  scanned  symbols  that  are  pertinent  to  the 
current  environment  (i.e.  not  yet  resolved  in  a  reduction).   This 
history  contains  the  information  necessary  to  resolve  the  conflicting 
semantics  which  cause  type  2  stratifications,  and  is  exactly  the 
information  on  the  state  stack! 

For instance, using Algorithm 1 and the LR table of Figure 3 to
parse the input string A:=B+(C:=D/2)-E, the parser takes on the
sequence of configurations given in Figure 8. The next parsing action
is to reduce by rule 1; but at first glance there appears to be
insufficient information to decide whether to generate a POP or a STORE.
However, utilizing the state stack data, it is clear that the only
sentential form (if used as an "input" sequence) that could have
produced this state sequence is:


T0 -> T1 -> T3 -> T4 -> T9 -> T8 -> T1 -> T3 -> T4
   VAR   :=   EXPR  OP    (     VAR   :=   EXPR

(This can easily be seen from the LR table: the only way to go from
T0 to T1 is by shifting a VAR; the only way to go from T1 to T3 is
by shifting a :=; and so on.) This confirms that indeed the
reduction will be to an embedded assignment; thus STORE is the proper
translation. In particular, merely knowing that state T8 is an element
of the state sequence is sufficient to verify that the assignment is
embedded (since T8 corresponds to having scanned the open parenthesis
of an embedded assignment), and to generate the correct translation
it is necessary only to see if T8 is contained in the state stack.
Resolving a choice of semantic actions, then, depends on some
condition of the state stack. Incorporating this concept into the
semantic table is a simple matter; some encoding of the desired
condition can be easily added to each element of the table:


    | encoding of condition | basic action # | link |

When  an  element  of  the  list  is  being  considered,  the  action  is 
performed  only  if  the  condition  is  found  to  hold  true,  then  the 
link  is  followed  to  the  next  action  as  before. 

Theoretically  the  state  conditions  might  become  extremely  complex, 
but  in  practice  only  a  few  basic  conditions  need  be  considered  useful. 
Some  typical  state  stack  conditions  might  be: 

-  Ti  appears  (or  doesn't  appear) 

-  Ti  and/or  Tj  appear  (or  not) 

-  Ti  and  Tj  appear  in  a  given  order 

Following are sample encodings of each of these conditions. In general,
the encoding scheme that should be used is the one which supplies
enough power (but no more!) to correctly handle the semantics of the
language.


Encoding 1:  single state appearance.

    | ± i | semantic action # | link |

    ± i      Semantic action to be done:
    + i      only if Ti appears on the state stack
    - i      only if Ti does not appear on the stack
    0        unconditionally


This  encoding  will  be  used  in  the  solution  to  the  embedded 
assignment  problem  presented  later  (see  Figure  9). 
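
A list of conditional elements under Encoding 1 might be interpreted
as sketched below. The three-field rows are the semantic table of
Figure 9; the dictionary layout, the use of 0 as end-of-list mark, and
the function names are assumptions of this sketch. The second state
stack is the one reached with the table of Figure 9 just before the
embedded assignment of A:=B+(C:=D/2)-E is reduced.

```python
NIL = 0  # assumed encoding of the end-of-list mark

# SEMANTIC_TABLE[pointer] = (condition, basic action number, link).
# Condition +i: Ti must appear; -i: Ti must not appear; 0: unconditional.
SEMANTIC_TABLE = {
    1: (0, 3, NIL),
    2: (+7, 2, 3),    # generate STORE if T7 is on the state stack ...
    3: (-7, 1, NIL),  # ... otherwise generate POP
    4: (0, 6, NIL),
    5: (0, 3, 6),
    6: (0, 4, NIL),
    7: (0, 5, NIL),
    8: (0, 7, NIL),
}

def condition_holds(condition, state_stack):
    """Evaluate an Encoding 1 condition against a list of state numbers."""
    if condition == 0:
        return True
    present = abs(condition) in state_stack
    return present if condition > 0 else not present

def selected_actions(pointer, state_stack):
    """Walk the list; keep only the actions whose condition holds."""
    actions = []
    while pointer != NIL:
        condition, action, pointer = SEMANTIC_TABLE[pointer]
        if condition_holds(condition, state_stack):
            actions.append(action)
    return actions

plain = [0, 1, 2, 3]                      # id := EXPR at the outer level
embedded = [0, 1, 2, 3, 8, 7, 1, 2, 3]    # ... op ( id := EXPR
print(selected_actions(2, plain), selected_actions(2, embedded))
```

Pointer 2 is the one attached to the reduction by rule 1, so the same
table entry yields basic action 1 (POP) at the outer level and basic
action 2 (STORE) inside parentheses. Note that under this encoding T0
itself cannot be tested, since 0 is reserved for "unconditionally".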


Encoding 2:  multiple state appearance with one logical operator
             specified implicitly.

    | ± i | ± j | semantic action # | link |

    ± i   ± j    Semantic action to be done:
    + i   + j    Ti and(or) Tj appear on the state stack
    - i   + j    Ti doesn't appear and(or) Tj does appear
                 and so on.

This encoding also encompasses Encoding 1; ±i from above is the
same as:

    ±i and +0    since T0 is always on the bottom of the stack

    ±i or  +n    where n is greater than the largest possible
                 state number, since that number could never
                 appear on the stack.




If  the  AND  operator  is  used  in  Encoding  2,  an  order/no  order 
bit  can  be  added  to  indicate  whether  or  not  the  states  need  be 
found  in  the  given  order. 

Encoding 3:  multiple state appearance and two logical operators.

    | 1=and / 0=or | ± i | ± j | semantic action # | link |

Except for the and/or bit, this is the same as Encoding 2.
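
The two-state conditions of Encodings 2 and 3 can be evaluated along
the following lines. This is a sketch under stated assumptions: each
term is represented as a pair (must_appear, state) rather than a
signed integer, so that +0 ("T0 appears") remains distinguishable, and
the function names are hypothetical.

```python
def term_holds(term, state_stack):
    """A term (True, i) stands for +i, (False, i) for -i."""
    must_appear, state = term
    return (state in state_stack) == must_appear

def condition_holds(use_and, first, second, state_stack):
    """Encoding 3: combine two terms with the and/or bit (True = and)."""
    a = term_holds(first, state_stack)
    b = term_holds(second, state_stack)
    return (a and b) if use_and else (a or b)

def appear_in_order(i, j, state_stack):
    """Order bit: Ti occurs on the stack and Tj occurs somewhere above it."""
    return i in state_stack and j in state_stack[state_stack.index(i) + 1:]

stack = [0, 1, 2, 3, 8, 7, 1, 2, 3]      # bottom of the stack first
too_big = 99                              # larger than any state number

# Encoding 1's "+7" expressed in both ways described above.
print(condition_holds(True, (True, 7), (True, 0), stack))
print(condition_holds(False, (True, 7), (True, too_big), stack))
print(appear_in_order(8, 7, stack), appear_in_order(7, 8, stack))
```

Encoding 2 corresponds to fixing use_and once for the whole table
instead of storing it in every element.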


Integrating the mechanism of linked lists of conditional
semantic actions into the LR table provides a powerful tool for
performing complex semantics without stratifying the grammar. As
a final example, Figure 9 shows the LR table and semantic table
for unstratified Grammar 4; this LR table represents a 44% savings
in the number of table entries over the table for the stratified
grammar for the same language in Figure 4 (or an 80% increase in
the other direction). Notice that Encoding 1 for state-stack
conditionals is used since it is powerful enough to correctly
perform the given semantics. Appendix B shows a sample PL/1
implementation of an LR parser using the ESDTS.
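
The percentages quoted in this paper can be reproduced with a short
calculation. It assumes that a table's size is counted as rows times
columns, blank entries included, and that every table has one column
for the end-of-input marker e; the state and column counts are read
off Figures 1 to 4 and 9.

```python
def entries(states, columns):
    """Number of table entries, blanks included (assumed counting rule)."""
    return states * columns

def percent_change(old, new):
    """Change from old to new, rounded to the nearest whole percent."""
    return round(100 * (new - old) / old)

grammar_1 = entries(10, 8)     # T0..T9;  id number := op e + 3 nonterminals
grammar_1e = entries(12, 10)   # T0..T11; two more nonterminals (VAR, OP)
grammar_3 = entries(15, 12)    # T0..T14; terminals now include ( and )
grammar_3e = entries(18, 13)   # T0..T17; one more nonterminal (EASSIGN)
grammar_4 = entries(13, 10)    # T0..T12; no VAR, OP or EASSIGN columns

print(percent_change(grammar_1, grammar_1e))   # 50
print(percent_change(grammar_3, grammar_3e))   # 30
print(percent_change(grammar_3e, grammar_4))   # -44
print(percent_change(grammar_4, grammar_3e))   # 80
```

Under this counting rule the tables hold 80, 120, 180, 234 and 130
entries respectively, which agrees with the 50%, 30%, 44% and 80%
figures given in the text.
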
STATES id     number :=     op     (      )      e      ASSIGN EXPR   ELEM

T0     S1/1                                             S12
T1                   S2
T2     S5     S6                   S7                          S3     S4
T3                          S8/4          R1/2   R1/2
T4                          R2            R2     R2
T5                          R4/5          R4/5   R4/5
T6                          R5/7          R5/7   R5/7
T7     S1/1                                             S9
T8     S5     S6                   S7                                 S10
T9                                        S11
T10                         R3/8          R3/8   R3/8
T11                         R6            R6     R6
T12                                              A

Grammar 4:

1  ASSIGN ::= id := EXPR
2  EXPR   ::= ELEM
3         ::= EXPR op ELEM
4  ELEM   ::= id
5         ::= number
6         ::= ( ASSIGN )

Basic Semantic Actions:

1  Generate POP
2  Generate STORE
3  Generate PUSH #id
4  Generate FETCH
5  Generate PUSH number
6  Push op code onto operator stack
7  Pop operator stack and generate
   SADD, SSUB, SMUL, SDIV appropriately

SEMANTIC TABLE

    #    CONDITION    BASIC ACTION #    LINK
    1        0              3            Λ
    2       +7              2            3
    3       -7              1            Λ
    4        0              6            Λ
    5        0              3            6
    6        0              4            Λ
    7        0              5            Λ
    8        0              7            Λ

Complete example of ESDTS.

Figure 9.

IV.   IMPLEMENTATION  CONSIDERATIONS

Two  of  the  major  factors  to  be  considered  in  evaluating  the 
ESDTS  as  a  useful  translation  tool  are  its  use  of  resources  and  its 
compatibility  with  parsing  and  optimization  techniques. 

Resources 

The savings in resources obtained by using the ESDTS depend
on the complexity of both the grammar and the semantics
to be performed. In deciding whether this translation scheme would
be more advantageous to use than normal SDTS for a particular grammar,
several aspects of implementation must be considered: data and
program memory requirements, and execution time.

In the comparison of the resources required in the ESDTS and
the SDTS, several assumptions are made. First, the analysis
will not include programming time or the logical complexity of the
different schemes. It is assumed that pointers are not implemented
as the general system concept of memory addresses (e.g. 24 bits on
the IBM 360/75) but are essentially integer pointers into an array,
as in the examples of the previous section. Also it is assumed that
there exists a "case selection" construct in the implementation
language, such as a CASE statement in PASCAL or a branch table (a
GOTO via an array of labels) in PL/I.

Program Space

A  driver  is  required  for  traversing  the  linked  lists  of  basic 
semantic  actions  and  for  decoding  and  checking  the  action  conditions. 
The  size  of  the  driver  is  dependent  on  the  complexity  of  the  state- 
stack  conditional  encoding  scheme  used.   However,  the  overhead  needed 
for  this  driver  is  balanced  by  the  saving  in  program  space  in  other 
areas,  depending  on  the  type  of  semantic  driver  used  in  the  old  scheme. 
For example, if the code for the basic semantics was repeated
wherever it was needed (as in the implementation in Appendix A), then the
space saved by removing duplicated code compensates for the addition
of the driver routine. If a method for performing semantics had been
used  where  the  basic  action  code  wasn't  repeated,  as  in  setting 
switches  to  control  the  selection  of  semantic  actions  or  using  a 
series  of  calls  to  basic  action  procedures,  then  converting  to  this 
new  scheme  may  still  represent  an  overall  savings  in  program  space 
since  the  code  that  previously  directed  the  performance  of  semantic 
actions  would  be  eliminated. 

The code necessary to decode the parse table entries is slightly
more complex in the ESDTS, but the basic mechanism is already present
in the old scheme so the incremental cost is small.

Data Space

In  general,  to  obtain  a  useful  comparison  between  the  data  space 
requirements  for  the  ESDTS  and  the  normal  SDTS,  an  estimate  of  the 
number  of  bits  used  in  each  scheme  is  needed.   In  particular,  the 
following  discussion  will  be  based  on  the  IBM  360  and  assumes  that 
the  data  space  is  utilized  in  "chunks"  such  as  bytes,  half-words  or 
words (32 bits).

The semantic table containing the linked lists of semantic
actions is an added data structure in the ESDTS. To get an idea
of how much space this requires, consider a reasonably large
example where there are:

-  250  states  (8  bits  +  sign  bit) 

-  125  basic  semantic  actions  (7  bits) 

-  250  list  entries  (8  bits) 

If condition encoding scheme 1 is used, each list entry can
easily be packed into one word of memory, which is a small
overhead compared to the amount of memory required for the LR table.
On the other hand, the semantic driver of a normal SDTS could be
implemented using linked lists of basic actions, in which case
changing to the ESDTS means extra memory overhead only for the
state-stack conditional encoding.
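
The packing of one list entry into a word can be sketched as follows. This is a minimal illustration in Python rather than the PL/I of the appendices; the field order, the sign-magnitude layout of the conditional state, and the names pack_entry and unpack_entry are assumptions made for the example, not part of the scheme as published. Only the field widths (8 bits plus sign, 7 bits, 8 bits) come from the list above.

```python
# Field widths taken from the example above; the layout itself is an assumption.
STATE_BITS = 9    # 8 magnitude bits + 1 sign bit for the conditional state
ACTION_BITS = 7   # up to 125 basic semantic actions
LINK_BITS = 8     # up to 250 list entries

def pack_entry(cond_state, action, link):
    """Pack (signed conditional state, action number, link) into one integer."""
    assert -255 <= cond_state <= 255
    assert 0 <= action < (1 << ACTION_BITS)
    assert 0 <= link < (1 << LINK_BITS)
    sign = 1 if cond_state < 0 else 0
    state_field = (sign << 8) | abs(cond_state)       # sign-magnitude encoding
    return (state_field << (ACTION_BITS + LINK_BITS)) | (action << LINK_BITS) | link

def unpack_entry(word):
    """Inverse of pack_entry."""
    link = word & ((1 << LINK_BITS) - 1)
    action = (word >> LINK_BITS) & ((1 << ACTION_BITS) - 1)
    state_field = word >> (ACTION_BITS + LINK_BITS)
    magnitude = state_field & 0xFF
    cond_state = -magnitude if state_field & 0x100 else magnitude
    return cond_state, action, link

word = pack_entry(-7, 1, 0)
print(unpack_entry(word))   # the three fields come back unchanged
```

The three fields occupy 9 + 7 + 8 = 24 bits, so under these assumptions an entry fits in a 32-bit word with 8 bits to spare.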

The  LR  table  is  the  largest  data  structure  and  is  where  the 
basic  motivation  for  saving  memory  space  originated,  so  it  is  the 
key  factor  in  the  analysis.   A  comparison  is  required  between  the 
number  of  bits  used  by  the  parse  table  when  the  grammar  has  been 
stratified  but  semantic  pointers  are  not  present  and  the  number  of 
bits  used  when  the  grammar  is  not  stratified  and  semantic  pointers 
are  present. 

Since a semantic pointer has essentially been embedded into
each entry of the parse table, it is desirable to minimize the
number of bits required for the encoding of this pointer. One way
to do this is to arrange the linked lists so that the headers of
lists (the pointers embedded in the parse table) always have low
indices in the semantic table, and the remaining elements in the
lists have high indices. (See Figure 10.) For example, assuming
there are 127 basic semantic actions and 31 lists, by arranging the
linked lists in this manner the semantic pointer in the parse table
need only be 5 bits instead of 8 bits.

[Figure 10: the semantic table, with the semantic pointers from the LR table all pointing into its low-index entries.]

Arrangement to minimize LR table semantic pointers.

Figure 10.
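
One way to produce such an arrangement is sketched below in Python. The function names arrange and pointer_bits, the dictionary representation of the table, and the convention that pointer value 0 means "no semantic list" are assumptions for illustration. The sample lists are the six lists of Figure 9, written as (conditional state, basic action) pairs.

```python
def arrange(lists):
    """Place list i's header at index i+1 and every later element at a
    higher index. `lists` holds one non-empty list of (cond_state, action)
    pairs per semantic list. Returns {index: (cond_state, action, link)},
    where link 0 marks the end of a list."""
    table = {}
    next_free = len(lists) + 1            # first index after all the headers
    for i, entries in enumerate(lists):
        index = i + 1                     # header slot for this list
        for j, (cond_state, action) in enumerate(entries):
            is_last = (j == len(entries) - 1)
            link = 0 if is_last else next_free
            table[index] = (cond_state, action, link)
            if not is_last:
                index = next_free
                next_free += 1
    return table

def pointer_bits(max_pointer):
    """Bits needed for pointer values 0..max_pointer (0 = no list)."""
    return max(1, max_pointer.bit_length())

figure9_lists = [
    [(0, 3)],
    [(+7, 2), (-7, 1)],
    [(0, 6)],
    [(0, 3), (0, 4)],
    [(0, 5)],
    [(0, 7)],
]
table = arrange(figure9_lists)
print(pointer_bits(len(figure9_lists)))   # bits per parse table pointer
print(pointer_bits(len(table)))           # bits if a pointer could name any entry
```

With 31 lists, pointer_bits(31) gives the 5 bits quoted above. The 8-bit figure corresponds to a pointer that may address any entry of a table of up to 255 entries, which is an interpretation rather than something the text states.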

EXAMPLE 1

Assumptions:   a)  data space allocated in half-words (16 bits)

               b)  30% reduction in the number of parse table
                   entries due to non-stratification

               c)  parsing action (2 bits)
                   250 states (and <250 production rules) (8 bits)
                   30 semantic lists (5 bits)

Analysis:

Since 10 of the 16 bits per entry would have already been used
in the normal SDTS, adding 5 more bits for the semantic pointers does
not increase memory requirements. Therefore, an actual 30% reduction
in storage space for the parse table is obtained, and even considering
the space required for the semantic table this would be a strong
candidate for the ESDTS implementation.

EXAMPLE 2

Assumptions:   a)  data space allocated in bytes (8 bits)

               b)  30% reduction in number of parse table
                   entries due to non-stratification

               c)  parsing action (2 bits)
                   510 states (and <510 production rules) (9 bits)
                   125 semantic lists (7 bits)

Analysis:

A parse table entry of the old scheme (11 bits) fits into 2
bytes, whereas an entry in the ESDTS table (18 bits) requires 3 bytes.
Thus 105% of the previous storage requirement is needed (70% as many
entries, each 3/2 as large). If this scheme doesn't produce extra
power or savings somewhere else, then it is
probably not a good candidate for this translation scheme. However,
assuming b) is now a 40% savings, then the ESDTS requires 90% of the
old scheme's space (60% as many entries, each 3/2 as large) and
becomes a good choice.
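
The arithmetic of the two examples can be checked with a short Python sketch. The function names and the rule that each entry is rounded up to a whole number of allocation units are assumptions chosen to match the analyses above. The space taken by the semantic table itself is not included.

```python
import math

def entry_units(field_bits, unit_bits):
    """Allocation units (bytes, half-words, ...) needed for one table entry."""
    return math.ceil(sum(field_bits) / unit_bits)

def esdts_ratio(sdts_fields, pointer_bits, unit_bits, entry_reduction):
    """ESDTS parse table size as a fraction of the SDTS parse table size."""
    old = entry_units(sdts_fields, unit_bits)
    new = entry_units(sdts_fields + [pointer_bits], unit_bits)
    return (1 - entry_reduction) * new / old

# Example 1: 2-bit action + 8-bit number, 5-bit pointer, half-word units.
print(round(esdts_ratio([2, 8], 5, 16, 0.30), 2))
# Example 2: 2-bit action + 9-bit number, 7-bit pointer, byte units.
print(round(esdts_ratio([2, 9], 7, 8, 0.30), 2))
print(round(esdts_ratio([2, 9], 7, 8, 0.40), 2))
```

The outcome is sensitive to the allocation unit. In Example 1 the pointer fits in bits that were already allocated, while in Example 2 it pushes every entry across a byte boundary.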

Execution  Time 

The difference in execution time in the two schemes is centered
mainly in the performance of semantic actions. Depending on the
complexity of the condition field of the semantic table entries and
the number of times the state stack is searched, execution time will
generally be larger than in the normal SDTS. This may be offset by
using a case-type statement (which is more efficient than duplicating
code or procedure calls) embedded in a loop along with the code
necessary to follow the linked lists of the semantic table; a sample
implementation is given in Appendix B. As always, there is a
trade-off between power, space, time, and having a clean, understandable
implementation.
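
The loop described above can be sketched in Python, using the semantic table of Figure 9. Several things here are assumptions for illustration: a dictionary of callables stands in for the case statement or branch table, a conditional state of 0 is treated as "unconditional", a link of 0 ends a list, and the actions merely log a name instead of generating code.

```python
# Semantic table of Figure 9: entry -> (conditional state, basic action, link).
SEMANTIC_TABLE = {
    1: (0, 3, 0), 2: (+7, 2, 3), 3: (-7, 1, 0), 4: (0, 6, 0),
    5: (0, 3, 6), 6: (0, 4, 0), 7: (0, 5, 0), 8: (0, 7, 0),
}
ACTION_NAMES = {
    1: "POP", 2: "STORE", 3: "PUSH #id", 4: "FETCH",
    5: "PUSH number", 6: "push op", 7: "emit op",
}

def make_logging_actions(log):
    """Stand-in for the case statement: action number -> callable."""
    return {number: (lambda text=text: log.append(text))
            for number, text in ACTION_NAMES.items()}

def run_semantics(pointer, state_stack, actions):
    """Follow the list that starts at `pointer`, performing each basic
    action whose state-stack condition (encoding 1) holds."""
    link = pointer
    while link != 0:
        cond_state, action, link = SEMANTIC_TABLE[link]
        if cond_state == 0:
            holds = True                                # unconditional (assumed)
        elif cond_state > 0:
            holds = cond_state in state_stack           # state must be present
        else:
            holds = -cond_state not in state_stack      # state must be absent
        if holds:
            actions[action]()

log = []
actions = make_logging_actions(log)
run_semantics(2, [0, 1, 2, 3], actions)            # assignment at the top level
run_semantics(2, [0, 1, 2, 7, 1, 2, 3], actions)   # assignment inside parentheses
run_semantics(5, [0, 1, 2, 5], actions)            # ELEM = id
print(log)
```

The search of the state stack is linear in the stack depth and is repeated for every conditional entry visited, which is the extra execution time referred to above.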

Compatibility

Since the ESDTS is a means of optimizing the LR table, it is
appropriate to discuss this scheme in terms of available compaction
techniques.

The ESDTS is compatible with any table compaction method which
does not modify the internal values of the entries but simply
manipulates each entry as a whole.

EXAMPLE 1:   reordering table rows and columns to obtain a block of
             "useless" entries (entries that cannot be reached unless
             more than one error has already been committed, or, if
             there is one large error-handling routine that is used
             for any error that occurs, entries that are all errors).
             The table can now be divided into two partial tables,
             and the "useless" block(s) can be eliminated. To
             illustrate, the reordered LR table of Grammar 4 (from
             Figure 6) is presented in Figure 11.

EXAMPLE 2:   implementing the table as a sparse matrix. See [KN68].

EXAMPLE 3:   implementing the table in a sequential fashion. Again
             the LR table of Grammar 4 is used to illustrate. See
             Figure 12.
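
The sequential implementation of Example 3 can be sketched in Python as follows. The row information and the (column, entry) pairs follow the layout of Figure 12. The closing sentinel in ROW_START, the string form of the entries, and the name lookup are assumptions for illustration. Each entry is handled as an opaque whole, so an embedded semantic pointer such as the 4 in S8/4 passes through untouched.

```python
# Row information: 1-based index of each state's first entry (T0..T12),
# plus an assumed closing sentinel so every row has an explicit end.
ROW_START = [1, 3, 4, 9, 12, 15, 18, 21, 23, 27, 28, 31, 34, 35]

# "Useful" entries only, each paired with the column in which it appears.
ENTRIES = [
    ("id", "S1/1"), ("ASSIGN", "S12"),                      # T0
    (":=", "S2"),                                           # T1
    ("id", "S5"), ("number", "S6"), ("(", "S7"),
    ("EXPR", "S3"), ("ELEM", "S4"),                         # T2
    (")", "R1/2"), ("op", "S8/4"), ("e", "R1/2"),           # T3
    (")", "R2"), ("op", "R2"), ("e", "R2"),                 # T4
    (")", "R4/5"), ("op", "R4/5"), ("e", "R4/5"),           # T5
    (")", "R5/7"), ("op", "R5/7"), ("e", "R5/7"),           # T6
    ("id", "S1/1"), ("ASSIGN", "S9"),                       # T7
    ("id", "S5"), ("number", "S6"), ("(", "S7"),
    ("ELEM", "S10"),                                        # T8
    (")", "S11"),                                           # T9
    (")", "R3/8"), ("op", "R3/8"), ("e", "R3/8"),           # T10
    (")", "R6"), ("op", "R6"), ("e", "R6"),                 # T11
    ("e", "A"),                                             # T12
]

def lookup(state, symbol):
    """Return the table entry for (state, symbol), or None for an error."""
    first = ROW_START[state] - 1        # convert the 1-based index to 0-based
    last = ROW_START[state + 1] - 1
    for column, entry in ENTRIES[first:last]:
        if column == symbol:
            return entry
    return None

print(lookup(3, "op"), lookup(1, "id"))
```

For this grammar the sequential form holds 34 entries where the full 13 by 10 table has 130 cells, at the price of a short linear search within each row.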


The ESDTS is also adaptable to other, more complex optimization
methods, such as state merging and elimination of non-essential
reductions. However, care must be exercised to ensure that semantic
information ends up in the "right place" in the table.

One last comment on compatibility: even though the ESDTS has
been presented in terms of LR parsing, it is suitable (with perhaps
slight modifications) for use with many other table-driven,
shift-reduce parsing techniques such as LL or Floyd-Evans methods.

[Figure 11: the LR table of Grammar 4 with its states renumbered (each old state Tn listed beside its new name Tm') and its rows and columns reordered.]

The reordered LR table of Grammar 4.
Figure 11.

Row information (indicates where each state starts):

State   T0   T1   T2   T3   T4   T5   T6   T7   T8   T9   T10  T11  T12
Start    1    3    4    9   12   15   18   21   23   27   28   31   34

"Useful" LR table entries, with column information (indicates in
which column the entry appears):

Index  Entry  Column        Index  Entry  Column
  1    S1/1   id             18    R5/7   )
  2    S12    ASSIGN         19    R5/7   op
  3    S2     :=             20    R5/7   e
  4    S5     id             21    S1/1   id
  5    S6     number         22    S9     ASSIGN
  6    S7     (              23    S5     id
  7    S3     EXPR           24    S6     number
  8    S4     ELEM           25    S7     (
  9    R1/2   )              26    S10    ELEM
 10    S8/4   op             27    S11    )
 11    R1/2   e              28    R3/8   )
 12    R2     )              29    R3/8   op
 13    R2     op             30    R3/8   e
 14    R2     e              31    R6     )
 15    R4/5   )              32    R6     op
 16    R4/5   op             33    R6     e
 17    R4/5   e              34    A      e

The LR table of Grammar 4 implemented
in sequential fashion.

Figure 12.

V.   SUMMARY

The ESDTS presented here is designed to alleviate the problem
of extremely large LR tables. In particular, the ESDTS

-  eliminates  stratifications  of  the  grammar,  resulting  in 
a  more  natural,  readable  grammar, 

-  allows  the  semantic  actions  to  be  parameterized,  thereby 
making  the  translator  easier  to  modify, 

-  allows  semantic  actions  to  be  performed  conditionally, 
taking  advantage  of  global  context  information, 

-  can produce large savings in memory requirements in the
   LR table, and

-  is compatible with other optimization techniques.

APPENDIX A
LR PARSER USING SDTS

/* PREDEFINED CONSTANTS:
      MAXSTATES, MAXRULES, MAXVOCAB (= #TERMINALS + #NONTERMINALS),
      AND SSTACK_SIZE

   MACROS:
      PUSH_STATE(N)    : PUSHES N ON TOP OF THE STATE STACK
                         WITH OVERFLOW CHECKING
      POP_STATES(M)    : POPS M ITEMS OFF THE STATE STACK
                         WITH UNDERFLOW CHECKING
      PT_DECODE(X,Y,Z) : FROM X=VALUE OF LR TABLE ENTRY
                         FINDS Y=TYPE OF PARSING ACTION
                               Z=PARSING ACTION NUMBER

   PROCEDURES:
      GET_TOKEN : LEXICAL ANALYSIS, RETURNS ENCODING OF
                  INCOMING TOKEN IN GLOBAL VARIABLE TOKEN

   COMMENTS:
      THOUGH NOT SHOWN, ARRAYS PT, LEFTSIDE, AND RIGHTPARTLEN
      ARE ASSUMED TO BE PROPERLY INITIALIZED */

PARSER: PROC;

   DCL ( PT(0:MAXSTATES,1:MAXVOCAB),   /* LR PARSE TABLE; STATES ARE
                                          NUMBERED FROM 0 */
         PT_ENTRY,ACT_TYPE,ACT_NUM,    /* INDIVIDUAL TABLE ENTRY
                                          AND ITS ELEMENTS */
         STATE_STACK(1:SSTACK_SIZE),   /* STATE STACK */
         LEFTSIDE(1:MAXRULES),         /* ENCODING OF NONTERMINALS
                                          ON LEFTHANDSIDE OF RULES */
         RIGHTPARTLEN(1:MAXRULES)      /* RIGHTHANDSIDE RULE LENGTHS */
       ) FIXED BIN(15),
       ( CASE(0:3) INIT(ERR,SHIFT,REDUCE,ACCEPT),
         SEMA(1:MAXRULES) ) LABEL,
       DONE INIT('0'B) BIT(1);

   STATE_TOS=1;
   STATE_STACK(STATE_TOS)=0;
   CALL GET_TOKEN;
   DO WHILE(¬DONE);
      PT_ENTRY=PT(STATE_STACK(STATE_TOS),TOKEN);
      PT_DECODE(PT_ENTRY,ACT_TYPE,ACT_NUM);
      GOTO CASE(ACT_TYPE);
ERR:     /* ERROR HANDLING ROUTINE */
         GOTO ENDCASE;
SHIFT:   PUSH_STATE(ACT_NUM);
         CALL GET_TOKEN;
         GOTO ENDCASE;
REDUCE:  POP_STATES(RIGHTPARTLEN(ACT_NUM));
         GOTO SEMA(ACT_NUM);
SEMA(1):    /* SEMANTICS ASSOCIATED WITH RULE 1 */
            GOTO ENDSEMA;
SEMA(2):    /* SEMANTICS ASSOCIATED WITH RULE 2 */
            GOTO ENDSEMA;
            /* ... */
SEMA(MAXRULES): /* SEMANTICS ASSOCIATED WITH LAST RULE */
ENDSEMA:
         PT_ENTRY=PT(STATE_STACK(STATE_TOS),LEFTSIDE(ACT_NUM));
         PT_DECODE(PT_ENTRY,ACT_TYPE,ACT_NUM);
         IF ACT_TYPE=1 THEN PUSH_STATE(ACT_NUM);
         ELSE /* ERROR */ ;
         GOTO ENDCASE;
ACCEPT:  PUT SKIP LIST('VALID PROGRAM ACCEPTED');
         RETURN;
ENDCASE:
   END; /* DO WHILE */
END PARSER;

APPENDIX B
LR PARSER USING ESDTS

/* PREDEFINED CONSTANTS:
      MAXSTATES, MAXVOCAB (= #TERMINALS + #NONTERMINALS), MAXRULES,
      SSTACK_SIZE, MAXSEMA, MAXSEMACTS

   MACROS:
      PUSH_STATE(N)       : PUSHES N ON TOP OF STATE STACK
                            WITH OVERFLOW CHECKING
      POP_STATES(M)       : POPS M ITEMS OFF TOP OF STATE
                            STACK WITH UNDERFLOW CHECKING
      PT_DECODE(W,X,Y,Z)  : FROM W=VALUE OF LR TABLE ENTRY,
                            FINDS X=PARSING ACTION TYPE
                                  Y=PARSING ACTION NUMBER
                                  Z=SEMANTIC POINTER
      SEM_DECODE(W,X,Y,Z) : FROM W=VALUE OF SEMANTIC TABLE ENTRY
                            FINDS X=CONDITIONAL STATE#
                                  Y=BASIC SEMANTIC ACTION #
                                  Z=LINK

   PROCEDURES:
      GET_TOKEN : LEXICAL ANALYSIS, RETURNS ENCODING OF INCOMING
                  TOKEN IN GLOBAL VARIABLE TOKEN

   COMMENTS:
      ARRAYS PT, LEFTSIDE, RIGHTPARTLEN, AND SEMATAB ARE ASSUMED
      TO BE PREVIOUSLY INITIALIZED
      STATE_STACK(1:SSTACK_SIZE) IS ASSUMED TO BE DECLARED
      FIXED BIN(15) GLOBALLY */

PARSER: PROC;

   DCL ( PT(0:MAXSTATES,1:MAXVOCAB),   /* LR PARSE TABLE; STATES ARE
                                          NUMBERED FROM 0 */
         PT_ENTRY ) FIXED BIN(31),     /* INDIVIDUAL TABLE ENTRY */
       ( ACT_TYPE,ACT_NUM,SEM_PTR,     /* TABLE FIELDS */
         LEFTSIDE(1:MAXRULES),         /* ENCODING OF NONTERMINALS
                                          ON LEFTHANDSIDE OF RULES */
         RIGHTPARTLEN(1:MAXRULES)      /* RIGHTHANDSIDE RULE LENGTHS */
       ) FIXED BIN(15),
       CASE(0:3) INIT(ERR,SHIFT,REDUCE,ACCEPT) LABEL,
       DONE INIT('0'B) BIT(1);

   STATE_TOS=1;
   STATE_STACK(STATE_TOS)=0;
   CALL GET_TOKEN;
   DO WHILE(¬DONE);
      PT_ENTRY=PT(STATE_STACK(STATE_TOS),TOKEN);
      PT_DECODE(PT_ENTRY,ACT_TYPE,ACT_NUM,SEM_PTR);
      CALL SEMA(SEM_PTR);
      GOTO CASE(ACT_TYPE);
ERR:     /* ERROR HANDLING ROUTINE */
         GOTO ENDCASE;
SHIFT:   PUSH_STATE(ACT_NUM);
         CALL GET_TOKEN;
         GOTO ENDCASE;
REDUCE:  POP_STATES(RIGHTPARTLEN(ACT_NUM));
         PT_ENTRY=PT(STATE_STACK(STATE_TOS),LEFTSIDE(ACT_NUM));
         PT_DECODE(PT_ENTRY,ACT_TYPE,ACT_NUM,SEM_PTR);
         IF ACT_TYPE=1 THEN DO;
            CALL SEMA(SEM_PTR);
            PUSH_STATE(ACT_NUM);
         END;
         ELSE /* ERROR */ ;
         GOTO ENDCASE;
ACCEPT:  PUT SKIP LIST('VALID PROGRAM ACCEPTED');
         RETURN;
ENDCASE:
   END; /* DO WHILE */
END PARSER;

APPENDIX B   CONT'D

SEMA: PROC(LINK);

   DCL SEMATAB(1:MAXSEMA) FIXED BIN(31),   /* SEMANTIC TABLE */
       (LINK,STATE#,SNUM) FIXED BIN(15),   /* TABLE FIELDS */
       FOUND BIT(1),
       SEM_ACT(1:MAXSEMACTS) LABEL;

LOOP: IF LINK=0 THEN RETURN;
      IF (LINK>MAXSEMA | LINK<1) THEN /* ERROR */ ;
      SEM_DECODE(SEMATAB(LINK),STATE#,SNUM,LINK);
      FOUND='0'B;
      DO I=STATE_TOS TO 1 BY -1;
         IF STATE_STACK(I)=ABS(STATE#)
            THEN DO; FOUND='1'B;
                     GOTO DONE;
                 END;
      END;
DONE: /* STATE 0 IS ALWAYS AT THE BOTTOM OF THE STACK, SO AN
         ENTRY WITH STATE#=0 IS PERFORMED UNCONDITIONALLY */
      IF (STATE#>=0 & FOUND) | (STATE#<0 & ¬FOUND)
         THEN GOTO SEM_ACT(SNUM);
         ELSE GOTO LOOP;
SEM_ACT(1): /* BASIC SEMANTIC ACTION 1 */
      GOTO LOOP;
SEM_ACT(2): /* BASIC SEMANTIC ACTION 2 */
      GOTO LOOP;
      /* ... */
SEM_ACT(MAXSEMACTS): /* LAST BASIC SEMANTIC ACTION */
      GOTO LOOP;

END SEMA;